4,284 research outputs found

Dynamic communicability and epidemic spread: a case study on an empirical dynamic contact network

Authors: Benzi Michele; Chang Howard H.; Chen Isabel; Hertzberg Vicki S.
Publication date: 01/01/2016

We analyze a recently proposed temporal centrality measure applied to an empirical network based on person-to-person contacts in an emergency department of a busy urban hospital. We show that temporal centrality identifies a different set of top-spreaders than centrality based on the time-aggregated binarized contact matrix, so that, taken together, the two measures capture top-spreaders significantly more accurately. However, with respect to predicting epidemic outcome, the temporal measure does not necessarily outperform less complex measures. Our results also show that other temporal markers, such as duration observed and the time of first appearance in the network, can be used in a simple predictive model to generate predictions that capture the trend of the observed data remarkably well.

Comment: 31 pages, 15 figures, 11 tables; typos corrected; references added; Figure 3 added; some changes to the conclusion and introduction

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Scuola Normale Superiore
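
The contrast drawn in the abstract above, between a centrality computed on the time-aggregated binarized contact matrix and a temporal centrality, can be illustrated with a minimal sketch. It assumes the resolvent-product form of dynamic communicability, Q = (I - aA_1)^{-1} ... (I - aA_M)^{-1}, with row sums as a "broadcast" score; the abstract does not state the formula. The toy snapshots, the parameter `a`, and all function names are hypothetical.

```python
import numpy as np

def aggregated_binarized(snapshots):
    """0/1 matrix with an edge wherever any snapshot records a contact."""
    total = np.sum(snapshots, axis=0)
    return (total > 0).astype(int)

def dynamic_communicability(snapshots, a):
    """Ordered product of resolvents (I - a*A_k)^{-1}, earliest snapshot first.

    Assumed form; `a` must stay below 1 / (largest spectral radius of any A_k).
    """
    n = snapshots[0].shape[0]
    Q = np.eye(n)
    for A in snapshots:
        Q = Q @ np.linalg.inv(np.eye(n) - a * A)
    return Q

# Toy data: persons 0 and 1 meet at time 1, persons 1 and 2 meet at time 2.
A1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
A2 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float)
snapshots = [A1, A2]

# Static view: degree in the aggregated, binarized network.
static_degree = aggregated_binarized(snapshots).sum(axis=1)

# Temporal view: row sums of Q score how well each person can "broadcast".
a = 0.5  # each toy snapshot has spectral radius 1, so a < 1 is required
Q = dynamic_communicability(snapshots, a)
broadcast = Q.sum(axis=1)

print("static degree:", static_degree)
print("broadcast score:", np.round(broadcast, 3))
```

In this toy example persons 0 and 2 have the same static degree, but only person 0 can reach person 2 along a time-respecting path, so the temporal score separates them while the aggregated matrix does not.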

A Longitudinal Feature Selection Method Identifies Relevant Genes to Distinguish Complicated Injury and Uncomplicated Injury Over Time

Authors: Chang Howard H.; Tian Suyan; Wang Chi
Publication venue: UKnowledge
Publication date: 07/12/2018

Background: Feature selection and gene set analysis are of increasing interest in the field of bioinformatics. While these two approaches have been developed for different purposes, we describe how some gene set analysis methods can be utilized to conduct feature selection. Methods: We adopted a gene set analysis method, the significance analysis of microarray gene set reduction (SAMGSR) algorithm, to carry out feature selection for longitudinal gene expression data. Results: Using a real-world application and simulated data, we demonstrate that the proposed SAMGSR extension outperforms other relevant methods. In this study, we illustrate that a gene’s expression profiles over time can be regarded as a gene set, and that a suitable gene set analysis method can then be utilized directly to select relevant genes associated with the phenotype of interest over time. Conclusions: We believe this work will motivate more research to bridge feature selection and gene set analysis, with the development of novel algorithms capable of carrying out feature selection for longitudinal gene expression data.

University of Kentucky

Weighted-SAMGSR: Combining Significance Analysis of Microarray-Gene Set Reduction Algorithm with Pathway Topology-Based Weights to Select Relevant Genes

Authors: Chang Howard H.; Tian Suyan; Wang Chi
Publication venue: UKnowledge
Publication date: 12/05/2016

Background: It has been demonstrated that a pathway-based feature selection method that incorporates biological information within pathways during the process of feature selection usually outperforms a gene-based feature selection algorithm in terms of predictive accuracy and stability. The significance analysis of microarray-gene set reduction (SAMGSR) algorithm, an extension to a gene set analysis method that further reduces the selected pathways to their respective core subsets, can be regarded as a pathway-based feature selection method. Methods: In SAMGSR, whether a gene is selected is mainly determined by its expression difference between the phenotypes, and partially by the number of pathways to which this gene belongs. It ignores the topology information among pathways. In this study, we propose a weighted version of the SAMGSR algorithm by constructing weights based on the connectivity among genes and then combining these weights with the test statistics. Results: Using both simulated and real-world data, we evaluate the performance of the proposed SAMGSR extension and demonstrate that the weighted version outperforms the original version. Conclusions: The additional gene connectivity information does facilitate feature selection.

arXiv.org e-Print Archive

PubMed Central

University of Kentucky

Bayesian Model Averaging for Clustered Data: Imputing Missing Daily Air Pollution Concentration

Authors: Chang Howard H; Dominici Francesca; Peng Roger D
Publication venue: Collection of Biostatistics Research Archive
Publication date: 15/12/2008

The presence of missing observations is a challenge in statistical analysis, especially when data are clustered. In this paper, we develop a Bayesian model averaging (BMA) approach for imputing missing observations in clustered data. Our approach extends BMA by allowing the weights of competing regression models for missing data imputation to vary between clusters while borrowing information across clusters in estimating model parameters. Through simulation and cross-validation studies, we demonstrate that our approach outperforms the standard BMA imputation approach where model weights are assumed to be the same for all clusters. We then apply our proposed method to a national dataset of daily ambient coarse particulate matter (PM10-2.5) concentration between 2003 and 2005. We impute missing daily monitor-level PM10-2.5 measurements and estimate the posterior probability of PM10-2.5 nonattainment status for 95 US counties based on the Environmental Protection Agency's proposed 24-hour standard.

Collection Of Biostatistics Research Archive

Minimum energy as the general form of critical flow and maximum flow efficiency and for explaining variations in river channel pattern

Authors: Chang Howard H; Huang He Qing; Nanson Gerald
Publication venue: 'Sociological Research Online'
Publication date: 01/01/2004

Although the Bélanger-Böss theorem of critical flow has been widely applied in open channel hydraulics, it was derived from the laws governing ideal frictionless flow. This study explores a more general expression of this theorem and examines its applicability to flow with friction and sediment transport. It demonstrates that the theorem can be more generally presented as the principle of minimum energy (PME), with maximum efficiency of energy use and minimum friction or minimum energy dissipation as its equivalents. Critical flow depth under frictionless conditions, the best hydraulic section where friction is introduced, and the most efficient alluvial channel geometry where both friction and sediment transport apply are all shown to be the products of PME. Because PME in liquids characterizes the stationary state of motion in solid materials, flow tends to rapidly expend excess energy when more than minimally demanded energy is available. This leads to the formation of relatively stable but dynamic energy-consuming meandering and braided channel planforms and explains the existence of various extremal hypotheses.

Research Online
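
The frictionless case mentioned in the abstract above, critical flow depth as a minimum-energy state, can be sketched with the standard textbook derivation for a rectangular channel. The symbols are assumptions introduced here, not taken from the paper: $E$ is specific energy, $h$ flow depth, $q$ discharge per unit width, and $g$ gravitational acceleration.

```latex
% Specific energy of frictionless flow in a rectangular channel (assumed setting)
E(h) = h + \frac{q^{2}}{2 g h^{2}}

% Minimum energy for a fixed discharge q: set the derivative to zero
\frac{dE}{dh} = 1 - \frac{q^{2}}{g h^{3}} = 0
\quad\Longrightarrow\quad
h_c = \left( \frac{q^{2}}{g} \right)^{1/3}

% At the critical depth the minimum energy is 3/2 of the depth,
% and the Froude number equals one
E_{\min} = \tfrac{3}{2}\, h_c ,
\qquad
\mathrm{Fr} = \frac{q}{h_c \sqrt{g h_c}} = 1
```

Read the other way, for a fixed specific energy the same depth maximizes the discharge, which is the sense in which minimum energy and maximum flow efficiency coincide in this idealized case.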

Identification of Prognostic Genes and Gene Sets for Early-Stage Non-Small Cell Lung Cancer Using Bi-Level Selection Methods

Authors: Chang Howard H.; Sun Jianguo; Tian Suyan; Wang Chi
Publication venue: UKnowledge
Publication date: 07/04/2017

In contrast to feature selection and gene set analysis, bi-level selection is a process of selecting not only important gene sets but also important genes within those gene sets. Depending on the order of selection, a bi-level selection method falls into one of three categories: forward selection, which first selects relevant gene sets and then relevant individual genes; backward selection, which takes the reverse order; and simultaneous selection, which performs the two tasks at once, usually with the aid of a penalized regression model. To test the existence of subtype-specific prognostic genes for non-small cell lung cancer (NSCLC), we previously proposed the Cox-filter method, which examines the association between patients’ survival time after diagnosis and one specific gene, the disease subtypes, and their interaction terms. In this study, we further extend it to carry out forward and backward bi-level selection. Using simulations and an NSCLC application, we demonstrate that the forward selection outperforms the backward selection and other relevant algorithms in our setting. Both proposed methods are readily understandable and interpretable. Therefore, they represent useful tools for researchers who are interested in exploring the prognostic value of gene expression data for specific subtypes or stages of a disease.

PubMed Central

University of Kentucky

Direct Numerical Simulation of the Sedimentation of Solid Particles with Thermal Convection

Authors: Chang Jianzhong; Feng James J.; Gan Hui; Hu Howard H.
Publication venue: ScholarlyCommons
Publication date: 01/01/2003

Dispersed two-phase flows often involve interfacial activities such as chemical reaction and phase change, which couple the fluid dynamics with heat and mass transfer. As a step toward understanding such problems, we numerically simulate the sedimentation of solid bodies in a Newtonian fluid with convection heat transfer. The two-dimensional Navier–Stokes and energy equations are solved at moderate Reynolds numbers by a finite-element method, and the motion of solid particles is tracked using an arbitrary Lagrangian–Eulerian scheme. Results show that thermal convection may fundamentally change the way that particles move and interact. For a single particle settling in a channel, various Grashof-number regimes are identified, where the particle may settle straight down, migrate toward a wall, or oscillate laterally. A pair of particles tends to separate if they are colder than the fluid and to aggregate if they are hotter. These effects are analysed in terms of the competition between the thermal convection and the external flow relative to the particle. The mechanisms thus revealed have interesting implications for the formation of microstructures in interfacially active two-phase flows.

ScholarlyCommons@Penn